31 Other methods for solving nonlinear equations in 1d
Here, I have summarised some of the answers for the following question on Assignment 5:
Find a research paper explaining a method named after one of the following people: Halley, Householder, Osada, Ostrowski, Steffensen, Traub. What is the main novelty of this method? How does it (claim to) improve upon previous methods in the literature? Implement your chosen method and test it on a function of your choice. Please clearly cite which paper you are describing.
31.1 Steffensen
Papers you cited: Steffensen (1933), Jain (2007), Jaiswal (2013), Cordero et al. (2012)
In Newton’s method, we replace the derivative f'(x_n) with the divided difference f[x_n, x_n + f(x_n)] = \frac{ f(x_n + f(x_n)) - f(x_n) }{(x_n + f(x_n)) - x_n} which gives
\begin{align*}
x_{n+1} &= x_n - \frac{ f(x_n) }{ f[x_n, x_n + f(x_n)] } \\
&= x_n - \frac{ f(x_n)^2 }{ f(x_n + f(x_n)) - f(x_n) }
\end{align*}
- Under some conditions [missing details], this method converges quadratically, but we no longer need to compute the derivative of f, which may be costly in practice.
- The efficiency index is \sqrt{2}, which is the same as Newton's.
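Below is a minimal sketch of this iteration in Julia (my own code, not taken from the cited papers; the tolerance, iteration cap, and interface are illustrative assumptions):

```julia
# A sketch of Steffensen's method; stopping rule and iteration cap are illustrative choices.
function steffensen(f, x; tol = 1e-12, maxiter = 100)
    for _ in 1:maxiter
        fx = f(x)
        abs(fx) < tol && break
        # the divided difference f[x_n, x_n + f(x_n)] stands in for f'(x_n)
        x -= fx^2 / (f(x + fx) - fx)
    end
    return x
end

steffensen(x -> x^3 - 2x^2 - 5, 2.5)  # ≈ 2.69065, cf. the table below
```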
Steffensen's method applied to f(x) = x^3 - 2x^2 - 5 starting at x[0] = 2.5

┌───────────┬─────────┬────────────────┬─────────┬────────────┐
│ iteration │    x[n] │ absolute error │   alpha │ mu (α = 2) │
├───────────┼─────────┼────────────────┼─────────┼────────────┤
│       1.0 │     2.5 │       0.190647 │     NaN │        NaN │
│       2.0 │    3.46 │       0.769353 │ 0.15821 │    21.1672 │
│       3.0 │ 3.41581 │       0.725159 │ 1.22562 │    1.22513 │
│       4.0 │ 3.36955 │       0.678903 │  1.2051 │    1.29105 │
│       5.0 │ 3.32103 │       0.630382 │ 1.19147 │    1.36769 │
│       6.0 │ 3.27002 │       0.579368 │ 1.18288 │    1.45796 │
│       7.0 │ 3.21627 │       0.525618 │ 1.17838 │    1.56589 │
│       8.0 │ 3.15953 │       0.468885 │ 1.17758 │    1.69718 │
│       9.0 │  3.0996 │       0.408956 │ 1.18055 │    1.86013 │
│      10.0 │ 3.03637 │       0.345725 │ 1.18785 │    2.06717 │
│      11.0 │ 2.97001 │       0.279367 │ 1.20065 │     2.3373 │
│      12.0 │ 2.90137 │       0.210718 │ 1.22114 │    2.69992 │
│      13.0 │ 2.83271 │       0.142066 │ 1.25316 │    3.19953 │
│      14.0 │ 2.76925 │      0.0785979 │ 1.30334 │     3.8943 │
│      15.0 │  2.7204 │      0.0297563 │ 1.38189 │    4.81679 │
│      16.0 │  2.6958 │     0.00515575 │ 1.49874 │    5.82281 │
│      17.0 │ 2.69082 │    0.000172092 │ 1.64542 │    6.47406 │
│      18.0 │ 2.69065 │     1.96084e-7 │ 1.78192 │    6.62097 │
│      19.0 │ 2.69065 │    2.54907e-13 │ 1.87753 │    6.62974 │
└───────────┴─────────┴────────────────┴─────────┴────────────┘
This method applied to Kepler’s equation (A5 Section B) with \epsilon = 0.9 converges cubically (same as Newton in this particular case):
Steffensen's method applied to f(ψ) = ψ - 0.9 sin(ψ) - 2π starting at x[0] = 5.0

┌───────────┬─────────┬────────────────┬───────────┬────────────┐
│ iteration │    x[n] │ absolute error │     alpha │ mu (α = 3) │
├───────────┼─────────┼────────────────┼───────────┼────────────┤
│       1.0 │     5.0 │        1.28319 │       NaN │        NaN │
│       2.0 │ 5.45139 │       0.831796 │ -0.738606 │   0.393685 │
│       3.0 │    5.82 │       0.463185 │   4.17895 │   0.804829 │
│       4.0 │ 6.11414 │       0.169042 │    2.3097 │     1.7011 │
│       5.0 │ 6.26849 │      0.0146953 │    2.3741 │    3.04227 │
│       6.0 │ 6.28317 │     1.09845e-5 │   2.70578 │    3.46136 │
│       7.0 │ 6.28319 │    1.77636e-15 │   2.97435 │    1.34025 │
└───────────┴─────────┴────────────────┴───────────┴────────────┘
31.1.1 Variants of Steffensen
Papers you cited: Hafiz and Badawi (2013), Eskandari (2022)
Another variant of Steffensen's method uses the backward difference formula:
\begin{align}
x_{n+1} &= x_n - \frac{ f(x_n) }{ f[x_n, x_n - f(x_n)] } \\
&= x_n - \frac{ f(x_n)^2 }{ f(x_n) - f(x_n - f(x_n)) }
\end{align}
Or the central formula:
\begin{align}
x_{n+1} &= x_n - \frac{ f(x_n) }{ f[x_n - f(x_n), x_n + f(x_n)] } \\
&= x_n - \frac{ 2 f(x_n)^2 }{ f(x_n + f(x_n)) - f(x_n - f(x_n)) }
\end{align}
Or one could use a convex combination: for \theta \in [0,1],
\begin{align}
x_{n+1} = &\theta \left( x_n - \frac{ f(x_n)^2 }{ f(x_n + f(x_n)) - f(x_n) } \right) \\
&+ (1-\theta) \left( x_n - \frac{ f(x_n)^2 }{ f(x_n) - f(x_n - f(x_n)) } \right)
\end{align}
Or one could change the step size in the standard Steffensen method: for \tau \neq 0, consider
\begin{align}
x_{n+1} &= x_n - \frac{ f(x_n) }{ f[ x_n, x_n + \tau f(x_n) ] } \\
&= x_n - \frac{ \tau f(x_n)^2 }{ f(x_n + \tau f(x_n)) - f(x_n) }
\end{align}
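Each of these variants is a one-line change to the sketch above. For instance, here is a hedged sketch of the step-size variant (the interface and the default τ are my own choices; τ = 1 recovers the standard method):

```julia
# Steffensen's method with step size τ; τ = 1 recovers the standard iteration.
function steffensen_tau(f, x; τ = 0.5, tol = 1e-12, maxiter = 100)
    for _ in 1:maxiter
        fx = f(x)
        abs(fx) < tol && break
        x -= τ * fx^2 / (f(x + τ * fx) - fx)  # f(x_n) / f[x_n, x_n + τ f(x_n)]
    end
    return x
end
```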
31.2 Halley
Papers you cited: Davies and Dawson (1975), Brown (1977), Alefeld (1981), Gander (1985), Scavo and Thoo (1995), Proinov and Ivanov (2014), Gnang and Dubeau (2018), Barrada et al. (2020)
Suppose f is twice continuously differentiable and consider:
\begin{align} x_{n+1} = x_n - \frac{ f(x_n)f'(x_n) }{ f'(x_n)^2-\frac{1}{2} f(x_n) f''(x_n) } \end{align}
Under some conditions [missing details], this method converges cubically.
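A minimal sketch in Julia, assuming the caller supplies the derivatives (the names fp, fpp and the stopping rule are my own):

```julia
# A sketch of Halley's method; fp = f' and fpp = f'' are supplied by the caller.
function halley(f, fp, fpp, x; tol = 1e-12, maxiter = 100)
    for _ in 1:maxiter
        fx = f(x)
        abs(fx) < tol && break
        x -= fx * fp(x) / (fp(x)^2 - fx * fpp(x) / 2)
    end
    return x
end

halley(log, x -> 1/x, x -> -1/x^2, 5.0)  # ≈ 1.0, cf. the table below
```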
Halley's method applied to f(x) = log x starting at x[0] = 5.0

┌───────────┬──────────┬────────────────┬───────────┬──────────────┐
│ iteration │     x[n] │ absolute error │     alpha │ mu (α = 3.0) │
├───────────┼──────────┼────────────────┼───────────┼──────────────┤
│       1.0 │      5.0 │            4.0 │       NaN │          NaN │
│       2.0 │ 0.541029 │       0.458971 │ -0.561762 │   0.00717142 │
│       3.0 │   1.0207 │      0.0207005 │   4.97914 │     0.214104 │
│       4.0 │ 0.999999 │      7.1683e-7 │   3.64876 │     0.080812 │
│       5.0 │      1.0 │            0.0 │       Inf │          0.0 │
└───────────┴──────────┴────────────────┴───────────┴──────────────┘
For the following function, which has a root of multiplicity 3 at 2\pi, Newton converged linearly with asymptotic error constant \mu = \frac{2}{3}; Halley also converges only linearly here, with \mu = \frac{1}{2}:
Halley's method applied to f(ψ) = ψ - sin(ψ) - 2π starting at x[0] = 6.0

┌───────────┬─────────┬────────────────┬─────────┬────────────┐
│ iteration │    x[n] │ absolute error │   alpha │ mu (α = 1) │
├───────────┼─────────┼────────────────┼─────────┼────────────┤
│       1.0 │     6.0 │       0.283185 │     NaN │        NaN │
│       2.0 │ 6.14169 │       0.141498 │ 1.54992 │   0.499667 │
│       3.0 │ 6.21245 │      0.0707375 │ 1.35455 │   0.499917 │
│       4.0 │ 6.24782 │      0.0353673 │  1.2617 │   0.499979 │
│       5.0 │  6.2655 │      0.0176834 │ 1.20741 │   0.499995 │
│       6.0 │ 6.27434 │      0.0088417 │ 1.17178 │   0.499999 │
│       7.0 │ 6.27876 │     0.00442085 │  1.1466 │        0.5 │
└───────────┴─────────┴────────────────┴─────────┴────────────┘
31.3 Osada
Paper you used: Osada (1994)
Suppose f:\mathbb R \to \mathbb R has a root of multiplicity m \geq 2 and define
\begin{align} x_{n+1} = x_n - \frac12 m (m+1) \frac{ f(x_n) }{ f'(x_n) } + \frac12 (m-1)^2 \frac{f'(x_n)}{ f''(x_n) } \end{align}
- For f(x) = (x - \xi)^m this iteration converges in one step since
\begin{align}
&x - \frac12 m (m+1) \frac{ (x-\xi)^m }{ m (x-\xi)^{m-1} } + \frac12 (m-1)^2 \frac{ m (x - \xi)^{m-1} }{ m(m-1) (x - \xi)^{m-2} } \\
&= x - \frac12 (m+1) (x-\xi) + \frac12 (m-1) (x - \xi) = \xi
\end{align}
- Under some conditions [details missing], Osada's method converges cubically, as the tables below illustrate.
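A minimal sketch in Julia, with the multiplicity m and the derivatives supplied by the caller (the interface and stopping rule are my own assumptions):

```julia
# A sketch of Osada's method for a root of multiplicity m ≥ 2;
# fp = f' and fpp = f'' are supplied by the caller.
function osada(f, fp, fpp, m, x; tol = 1e-12, maxiter = 100)
    for _ in 1:maxiter
        abs(f(x)) < tol && break
        x += -m * (m + 1) / 2 * f(x) / fp(x) + (m - 1)^2 / 2 * fp(x) / fpp(x)
    end
    return x
end

# triple root at 1, cf. the first table below
f(x)   = exp(2(x - 1)) * (x - 1)^3
fp(x)  = exp(2(x - 1)) * (x - 1)^2 * (2x + 1)
fpp(x) = exp(2(x - 1)) * (x - 1) * (4x^2 + 4x - 2)
osada(f, fp, fpp, 3, 0.0)  # ≈ 1.0
```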
Osada's method applied to f(x) = exp( 2(x-1) ) (x-1)^3 (here, m = 3) starting at x[0] = 0.0

┌───────────┬─────────┬────────────────┬───────────┬────────────┐
│ iteration │    x[n] │ absolute error │     alpha │ mu (α = 3) │
├───────────┼─────────┼────────────────┼───────────┼────────────┤
│       1.0 │     0.0 │            1.0 │       NaN │        NaN │
│       2.0 │     7.0 │            6.0 │       Inf │        6.0 │
│       3.0 │ 5.41081 │        4.41081 │  0.828269 │  0.0204204 │
│       4.0 │ 3.93473 │        2.93473 │  0.725453 │  0.0341989 │
│       5.0 │ 2.63744 │        1.63744 │  0.458043 │  0.0647833 │
│       6.0 │ 1.63668 │        0.63668 │ -0.915547 │   0.145018 │
│       7.0 │  1.0993 │       0.099301 │   5.11552 │   0.384761 │
│       8.0 │ 1.00088 │     0.00088033 │   3.04607 │   0.899052 │
│       9.0 │     1.0 │    7.56534e-10 │   2.98531 │     1.1089 │
└───────────┴─────────┴────────────────┴───────────┴────────────┘
Osada's method applied to f(ψ) = ψ - sin(ψ) - 2π (here, m = 3) starting at x[0] = 6

┌───────────┬─────────┬────────────────┬─────────┬────────────┐
│ iteration │    x[n] │ absolute error │   alpha │ mu (α = 3) │
├───────────┼─────────┼────────────────┼─────────┼────────────┤
│       1.0 │     6.0 │       0.283185 │     NaN │        NaN │
│       2.0 │  6.2828 │    0.000389449 │ 6.22261 │   0.017149 │
│       3.0 │ 6.28319 │     4.24732e-9 │ 2.45542 │    71.9059 │
└───────────┴─────────┴────────────────┴─────────┴────────────┘
31.4 Traub
Paper you used: Saeed K et al. (2023)
We shall consider the following method:
\begin{align}
y_n &= x_n - \frac{ f(x_n) }{ f'(x_n) } \\
x_{n+1} &= x_n - \frac{ f(x_n) }{ \frac{1}{2} [ f'(x_n) + f'(y_n) ] }
\end{align}
Under some conditions [details missing], this method has cubic order of convergence.
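A minimal Julia sketch of this two-step method, assuming f' is supplied by the caller (the name fp and the stopping rule are my own):

```julia
# A sketch of the two-step (Traub) method; fp = f' is supplied by the caller.
function traub(f, fp, x; tol = 1e-12, maxiter = 100)
    for _ in 1:maxiter
        fx = f(x)
        abs(fx) < tol && break
        y = x - fx / fp(x)               # Newton predictor
        x -= fx / ((fp(x) + fp(y)) / 2)  # corrector using the averaged slope
    end
    return x
end

traub(x -> x^3 - 8, x -> 3x^2, 100.0)  # ≈ 2.0, cf. the table below
```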
Applied to the function f(x) = x^3 - 8, Newton's method converges quadratically:
Newton's method applied to f(x) = x^3 - 8 starting at x[0] = 100.0

┌───────────┬─────────┬────────────────┬───────────┬────────────┐
│ iteration │    x[n] │ absolute error │     alpha │ mu (α = 2) │
├───────────┼─────────┼────────────────┼───────────┼────────────┤
│       1.0 │   100.0 │           98.0 │       NaN │        NaN │
│       2.0 │ 66.6669 │        64.6669 │   0.90933 │ 0.00673333 │
│       3.0 │ 44.4452 │        42.4452 │  0.899014 │    0.01015 │
│       4.0 │ 29.6315 │        27.6315 │  0.885477 │  0.0153372 │
│       5.0 │ 19.7574 │        17.7574 │  0.866779 │  0.0232579 │
│       6.0 │ 13.1784 │        11.1784 │  0.839121 │  0.0354505 │
│       7.0 │ 8.80096 │        6.80096 │  0.794149 │  0.0544265 │
│       8.0 │ 5.90174 │        3.90174 │   0.71016 │  0.0843562 │
│       9.0 │ 4.01105 │        2.01105 │  0.513183 │   0.132101 │
│      10.0 │ 2.83978 │       0.839784 │ -0.249923 │   0.207645 │
│      11.0 │ 2.22386 │       0.223862 │    8.5718 │   0.317428 │
│      12.0 │ 2.02178 │      0.0217786 │    2.5568 │    0.43458 │
│      13.0 │ 2.00023 │    0.000233757 │    2.1849 │   0.492838 │
│      14.0 │     2.0 │     2.73168e-8 │   2.08292 │   0.499922 │
│      15.0 │     2.0 │    4.44089e-16 │    2.0298 │   0.595128 │
└───────────┴─────────┴────────────────┴───────────┴────────────┘
whereas Traub's method converges cubically:
Traub's method applied to f(x) = x^3 - 8 starting at x[0] = 100.0

┌───────────┬─────────┬────────────────┬───────────┬─────────────┐
│ iteration │    x[n] │ absolute error │     alpha │  mu (α = 3) │
├───────────┼─────────┼────────────────┼───────────┼─────────────┤
│       1.0 │   100.0 │           98.0 │       NaN │         NaN │
│       2.0 │ 53.8466 │        51.8466 │  0.861138 │  5.50861e-5 │
│       3.0 │  28.996 │         26.996 │  0.834713 │ 0.000193704 │
│       4.0 │  15.619 │         13.619 │  0.792388 │ 0.000692223 │
│       5.0 │    8.43 │           6.43 │  0.712617 │  0.00254553 │
│       6.0 │ 4.60695 │        2.60695 │  0.514881 │  0.00980617 │
│       7.0 │ 2.70353 │       0.703532 │ -0.366989 │   0.0397087 │
│       8.0 │  2.0505 │      0.0504967 │   8.49115 │    0.145015 │
│       9.0 │ 2.00004 │     3.55831e-5 │   3.43073 │    0.276347 │
│      10.0 │     2.0 │    1.33227e-14 │   3.11894 │    0.295707 │
└───────────┴─────────┴────────────────┴───────────┴─────────────┘
31.5 Ostrowski
See Ostrowski for more details.
Paper you used: Postigo (2023)
\begin{align}
y_n &= x_n - \frac{f(x_n)}{f'(x_n)} \\
x_{n+1} &= y_n - \frac{ f(y_n) }{ f'(x_n) } \frac{ f(x_n) }{ f(x_n) - 2f(y_n) }
\end{align}
- Suppose f(\xi) = 0 and f is three times continuously differentiable with f'(\xi) \neq 0. Then, if |x_1 - \xi| is sufficiently small, Ostrowski's method converges quartically (with order 4).
- The efficiency index of Newton is 2^{\frac12} \approx 1.41..., whereas the efficiency index of Ostrowski is 4^{\frac13} \approx 1.59...
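A minimal Julia sketch of this two-step iteration (interface and stopping rule are my own). Note that f'(x_n) is reused in the second step, so each iteration costs two evaluations of f and one of f', which is what gives the 4^{1/3} efficiency index:

```julia
# A sketch of Ostrowski's method; fp = f' is supplied by the caller.
function ostrowski(f, fp, x; tol = 1e-12, maxiter = 100)
    for _ in 1:maxiter
        fx = f(x)
        abs(fx) < tol && break
        d  = fp(x)        # f'(x_n), reused in both steps
        y  = x - fx / d   # Newton step
        fy = f(y)
        x  = y - (fy / d) * fx / (fx - 2fy)  # Ostrowski correction
    end
    return x
end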
31.6 Householder
Paper you used: Foo and Tan (2025)
Fix d \in \mathbb N. Suppose f is d+1 times continuously differentiable and define
\begin{align*} x_{n+1} = x_n + d \frac { (1/f)^{(d-1)}(x_n) } { (1/f)^{(d)}(x_n) }. \end{align*}
Under some conditions [missing details], this method converges with order of convergence d+1.
Remark.
When d = 1, we have
\begin{align*}
d \frac { (1/f)^{(d-1)}(x_n) } { (1/f)^{(d)}(x_n) } &= 1 \cdot \frac{ (1/f)(x_n) }{ -(f'/f^2)(x_n) } \\
&= - \frac{f(x_n)}{f'(x_n)}
\end{align*}
and so Householder with d = 1 is the same as Newton’s method.
Remark.
When d = 2, we have
\begin{align*}
d \frac { (1/f)^{(d-1)}(x_n) } { (1/f)^{(d)}(x_n) } &= 2 \frac{ (-f'/f^2)(x_n) }{ ((-f''f + 2 (f')^2)/f^3)(x_n) } \\
&= -2 \frac{ (ff')(x_n) }{ (-f''f + 2 (f')^2)(x_n) } \\
&= - \frac{ f(x_n)f'(x_n) }{ f'(x_n)^2 - \frac{1}{2} f(x_n) f''(x_n) }
\end{align*}
and so Householder with d = 2 is the same as Halley’s method.
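For general d the iteration needs the (d-1)-th and d-th derivatives of 1/f. One way to sketch this in Julia is with nested automatic differentiation via ForwardDiff.jl (an assumed dependency; the helper nderiv and the interface are my own, not from the cited paper):

```julia
using ForwardDiff

# d-th derivative of g at x via nested forward-mode automatic differentiation
nderiv(g, x, d) = d == 0 ? g(x) : ForwardDiff.derivative(y -> nderiv(g, y, d - 1), x)

# A sketch of the Householder iteration of order d + 1.
function householder(f, x, d; tol = 1e-12, maxiter = 100)
    g = y -> inv(f(y))  # the iteration is expressed through derivatives of 1/f
    for _ in 1:maxiter
        abs(f(x)) < tol && break
        x += d * nderiv(g, x, d - 1) / nderiv(g, x, d)
    end
    return x
end

householder(x -> x^3 - 8, 100.0, 1)  # d = 1 reproduces Newton, ≈ 2.0
householder(x -> x^3 - 8, 100.0, 2)  # d = 2 reproduces Halley, ≈ 2.0
```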